    Diverse Retrieval-Augmented In-Context Learning for Dialogue State Tracking

    There has been significant interest in zero- and few-shot learning for dialogue state tracking (DST) due to the high cost of collecting and annotating task-oriented dialogues. Recent work has demonstrated that in-context learning requires very little data and zero parameter updates, and even outperforms trained methods in the few-shot setting (Hu et al. 2022). We propose RefPyDST, which advances the state of the art with three improvements to in-context learning for DST. First, we formulate DST as a Python programming task, explicitly modeling language coreference as variable reference in Python. Second, since in-context learning depends highly on the context examples, we propose a method to retrieve a diverse set of relevant examples to improve performance. Finally, we introduce a novel re-weighting method during decoding that takes into account the probabilities of competing surface forms and produces a more accurate dialogue state prediction. We evaluate our approach on MultiWOZ and achieve state-of-the-art multi-domain joint-goal accuracy in zero- and few-shot settings.
    Comment: 14 pages, 2 figures, to appear in Findings of the ACL 202
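The first contribution — casting DST as Python code so that coreference becomes variable reference — can be illustrated with a minimal sketch. This is not the authors' implementation; the slot names (`restaurant`, `hotel`, `area`) are hypothetical, chosen only to show how "the same area" resolves to a variable lookup rather than a copied string:

```python
# Illustrative sketch, not RefPyDST itself: a dialogue state written as
# Python, where a coreferent slot value is a reference to a prior variable.
from dataclasses import dataclass, field

@dataclass
class Slots:
    domain: str
    values: dict = field(default_factory=dict)

# Turn 1: "I need a restaurant in the centre."
restaurant = Slots("restaurant", {"area": "centre"})

# Turn 2: "And book a hotel in the same area."
# The coreference "same area" is modeled as a variable reference.
hotel = Slots("hotel", {"area": restaurant.values["area"]})

state = {s.domain: s.values for s in (restaurant, hotel)}
print(state)  # both domains now share area "centre"
```

A code-generation model prompted in this format can emit the reference expression directly, so the resolved state falls out of simply executing the generated code.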

    Forming Trees with Treeformers

    Popular models such as Transformers and LSTMs use tokens as their unit of information: each token is encoded into a vector representation, and those vectors are used directly in computation. However, humans frequently consider spans of tokens (i.e., phrases) rather than their constituent tokens. In this paper we introduce Treeformer, an architecture inspired by the CKY algorithm and the Transformer, which learns a composition operator and a pooling function in order to construct hierarchical encodings for phrases and sentences. Our extensive experiments demonstrate the benefits of incorporating hierarchical structure into the Transformer and show significant improvements over a baseline Transformer in machine translation, abstractive summarization, and various natural language understanding tasks.
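The CKY-style bottom-up chart that Treeformer builds can be sketched with placeholder operators. In the paper the composition and pooling functions are learned; here they are stand-ins (elementwise average and elementwise max) chosen only to make the chart-filling loop concrete:

```python
# Illustrative sketch of CKY-style span encoding (placeholder operators,
# not Treeformer's learned composition and pooling functions).

def compose(left, right):
    # Stand-in composition operator: elementwise average of child spans.
    return [(a + b) / 2 for a, b in zip(left, right)]

def pool(candidates):
    # Stand-in pooling function: elementwise max over split points.
    return [max(vals) for vals in zip(*candidates)]

def cky_encode(token_vecs):
    n = len(token_vecs)
    chart = {(i, i): v for i, v in enumerate(token_vecs)}
    for width in range(1, n):            # grow spans bottom-up
        for i in range(n - width):
            j = i + width
            # every way to split span (i, j) into two child spans
            cands = [compose(chart[(i, k)], chart[(k + 1, j)])
                     for k in range(i, j)]
            chart[(i, j)] = pool(cands)
    return chart[(0, n - 1)]             # encoding of the full sentence

print(cky_encode([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))  # -> [0.75, 0.75]
```

The O(n^3) chart is the same shape as in CKY parsing; the difference is that cell contents are dense span encodings rather than discrete nonterminals.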

    Does the "most sinfully decadent cake ever" taste good? Answering Yes/No Questions from Figurative Contexts

    Figurative language is commonplace in natural language and, while it makes communication memorable and creative, can be difficult to understand. In this work, we investigate the robustness of Question Answering (QA) models on figurative text. Yes/no questions, in particular, are a useful probe of the figurative language understanding capabilities of large language models. We propose FigurativeQA, a set of 1000 yes/no questions with figurative and non-figurative contexts, extracted from the domains of restaurant and product reviews. We show that state-of-the-art BERT-based QA models exhibit an average performance drop of up to 15 percentage points when answering questions from figurative contexts, as compared to non-figurative ones. While models like GPT-3 and ChatGPT are better at handling figurative texts, we show that further performance gains can be achieved by automatically simplifying the figurative contexts into their non-figurative (literal) counterparts. We find that the best overall model is ChatGPT with chain-of-thought prompting to generate non-figurative contexts. Our work provides a promising direction for building more robust QA models with figurative language understanding capabilities.
    Comment: Accepted at RANLP 202

    Structural Prediction and Mutational Analysis of the Gifsy-1 Xis Protein

    Background
    The Gifsy-1 phage integrates into the Salmonella Typhimurium chromosome via an integrase-mediated, site-specific recombination mechanism. Excision of the Gifsy-1 phage requires three proteins: the Gifsy-1 integrase (Int), the Gifsy-1 excisionase (Xis) protein, and the host-encoded Integration Host Factor (IHF). The Gifsy-1 xis gene encodes the 94-residue Gifsy-1 excisionase protein, which has a molecular weight of 11.2 kDa and a pI of 10.2. Electrophoretic Mobility Shift Assays (EMSA) suggested that at least one region of the protein is responsible for protein-DNA interactions with a tripartite DNA binding site composed of three direct imperfect repeats.

    Results
    Here we have undertaken experiments to dissect and model the structural motifs of Gifsy-1 Xis necessary for its observed DNA binding activity. Diethyl sulfate (DES) mutagenesis and mutagenic PCR techniques were used to generate Gifsy-1 xis mutants. Mutant Xis proteins that lacked activity in vivo were purified and tested by EMSA for binding to the Gifsy-1 Xis attP attachment site. Results from the mutagenesis experiments and EMSA were compared to results of structural predictions and sequence analyses.

    Conclusion
    Sequence comparisons revealed evidence for three distinct structural motifs in the Gifsy-1 Xis protein. Multiple sequence alignments revealed unexpected homologies between the Gifsy-1 Xis protein and two distinct subsets of polynucleotide binding proteins. Our data may suggest a role for Gifsy-1 Xis in the regulation of Gifsy-1 phage excision beyond that of DNA binding and possible interactions with the Gifsy-1 Int protein.

    The Materials Science Procedural Text Corpus: Annotating Materials Synthesis Procedures with Shallow Semantic Structures

    Materials science literature contains millions of materials synthesis procedures described in unstructured natural language text. Large-scale analysis of these synthesis procedures would facilitate deeper scientific understanding of materials synthesis and enable automated synthesis planning. Such analysis requires extracting structured representations of synthesis procedures from the raw text as a first step. To facilitate the training and evaluation of synthesis extraction models, we introduce a dataset of 230 synthesis procedures annotated by domain experts with labeled graphs that express the semantics of the synthesis sentences. The nodes in this graph are synthesis operations and their typed arguments, and labeled edges specify relations between the nodes. We describe this new resource in detail and highlight some specific challenges to annotating scientific text with shallow semantic structure. We make the corpus available to the community to promote further research and development of scientific information extraction systems.
    Comment: Accepted as a long paper at the Linguistic Annotation Workshop (LAW) at ACL 201
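The operation/argument graph described above can be pictured with a tiny in-memory sketch. The node types and edge labels here (`Operation`, `Material`, `Condition`, `recipe_target`, `condition_of`) are hypothetical stand-ins, not the released annotation schema:

```python
# Illustrative sketch (hypothetical labels, not the corpus schema):
# one synthesis sentence as a labeled graph of an operation node,
# its typed argument nodes, and labeled edges between them.

nodes = {
    "n1": ("Operation", "heat"),
    "n2": ("Material", "TiO2 powder"),
    "n3": ("Condition", "500 C"),
}
edges = [
    ("n1", "n2", "recipe_target"),  # heat -> TiO2 powder
    ("n1", "n3", "condition_of"),   # heat -> 500 C
]

def arguments_of(op_id):
    # Collect the typed arguments attached to an operation node.
    return {label: nodes[dst][1] for src, dst, label in edges if src == op_id}

print(arguments_of("n1"))
```

A synthesis extraction model trained on such annotations would predict the operation nodes, argument spans, and edge labels jointly from the raw sentence.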

    CMU: Arc-Factored, Discriminative Semantic Dependency Parsing

    We present an arc-factored statistical model for semantic dependency parsing, as defined by the SemEval 2014 Shared Task 8 on Broad-Coverage Semantic Dependency Parsing. Our entry in the open track placed second in the competition.